This is in response to the appeal brief filed 03 March appealing from the Office action 
mailed 23 June 2009. 

(1) Real Party in Interest 

The examiner has no comment on the statement, or lack of statement, identifying 
by name the real party in interest in the brief. 

(2) Related Appeals and Interferences 

The examiner is not aware of any related appeals, interferences, or judicial 
proceedings which will directly affect or be directly affected by or have a bearing on the 
Board's decision in the pending appeal. 

(3) Status of Claims 

The following is a list of claims that are rejected and pending in the application: 
Claims 1, 2, 6, 7, 9, 14, and 15. 

(4) Status of Amendments After Final 

The examiner has no comment on the appellant's statement of the status of 
amendments after final rejection contained in the brief. 

(5) Summary of Claimed Subject Matter 

The examiner has no comment on the summary of claimed subject matter 
contained in the brief. 

(6) Grounds of Rejection to be Reviewed on Appeal 

The examiner has no comment on the appellant's statement of the grounds of 
rejection to be reviewed on appeal. Every ground of rejection set forth in the Office 
action from which the appeal is taken (as modified by any advisory actions) is being 
maintained by the examiner except for the grounds of rejection (if any) listed under the 
subheading "WITHDRAWN REJECTIONS." New grounds of rejection (if any) are 
provided under the subheading "NEW GROUNDS OF REJECTION." 

Application/Control Number: 10/736,440 
Art Unit: 2626 

(7) Claims Appendix 

The examiner has no comment on the copy of the appealed claims contained in 
the Appendix to the appellant's brief. 

(8) Evidence Relied Upon 

2003/0061048 Wu et al 3-2003 

6,976,082 Ostermann et al 12-2005 

2001/0047260 Walker et al 10-2001 

(9) Grounds of Rejection 

The following ground(s) of rejection are applicable to the appealed claims: 

Claim Rejections - 35 USC § 103 

1. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

2. Claims 1, 2, 6, 7, 9, 14, and 15 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Wu et al (US PGPub 2003/0061048) in view of Ostermann et al 
(USPN 6,976,082) and in further view of Walker et al (US PGPub 2001/0047260). 

Claim 1: 

Wu discloses a system for generating a collection of speech generation 
commands associated with computer readable information (Abstract), comprising: 

a first computer (network server) configured to generate a first collection of 
speech generation commands (coded speech parameters) based on a first portion of 
computer readable information (text) (see [0019]); 

the first computer in communication with a communication network and a phone 
operatively communicating with the communication network, wherein signals generated 
by the first computer are transmitted through the communication network to the phone 
("transmitting the coded speech parameters from a network server to a wireless 
communication device", [0019]). 

Wu further discloses either the phone receiving the first collection of speech 
generation commands and accessing a predetermined set of the speech samples in the 
voice file based on the first collection of speech generation commands to generate 
auditory speech ("the native coded speech parameters, corresponding to each of the 
phonics from the previous step and along with suitable spaces, are subsequently 
processed in a signal processor 208 (such as a DSP for example) to provide a 
decompressed speech signal to an audio circuit 210 of the cellular phone handset", 
[0018]) or the phone receiving signals corresponding to auditory speech and generating 
auditory speech from the received signals ("Alternatively, a network server of the 
communication system can converts this formatted text string to speech and transmit 
this speech to a conventional cellular handset over a voice channel instead of a data 
channel", [0011]). In other words, Wu discloses, either receiving textual information in 
the form of coded speech parameters and performing a text-to-speech process at the 
phone or, performing the text-to-speech (TTS) process at a server and transmitting 
speech to the phone. 

However, Wu does not explicitly disclose determining whether the phone 
includes a voice file (i.e. is able to perform text-to-speech, which requires voice files for 
concatenation based TTS) and conducting the text-to-speech either at the phone, if a 
voice file is present at the phone, or at the server, if a voice file is not present on the 
phone. 

In a similar network based text-to-speech system, Ostermann discloses checking 
if a phone (col. 6, lines 5-11) has speech synthesis software (which requires voice files 
for concatenation based TTS) and performing the TTS at the phone if the phone has 
TTS capabilities or performing the TTS at a server and transmitting synthesized speech 
to the device from the server if the device does not have speech synthesis software 
(col. 11, lines 15-26). 

It would have been obvious to one with ordinary skill in the art at the time of the 
invention to check Wu's phone for TTS capabilities (voice files in concatenation based 
TTS) and to perform TTS on the phone, if the phone has voice files, or to perform the 
TTS on a server and transmit synthesized speech, if the phone does not have TTS 
capabilities, because a phone cannot perform TTS if it does not have TTS capabilities 
(voice files in concatenation based TTS). 
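
The capability check described in the combination above can be sketched as follows, purely as an illustration. The names `Phone`, `has_voice_file`, and `route_tts` are hypothetical and do not appear in Wu or Ostermann; the sketch assumes the phone's capability is already known as a simple flag.

```python
from dataclasses import dataclass


@dataclass
class Phone:
    # Hypothetical flag: True if the phone stores a voice file and can
    # therefore perform concatenation based TTS itself.
    has_voice_file: bool


def route_tts(phone: Phone) -> tuple[str, str]:
    """Return (where synthesis runs, what is transmitted to the phone)."""
    if phone.has_voice_file:
        # The phone synthesizes speech from the received commands.
        return ("phone", "speech_generation_commands")
    # The server synthesizes and transmits the resulting speech.
    return ("server", "synthesized_speech")
```

The function only selects a path; the synthesis itself and the transport (data channel or voice channel) are outside the sketch.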

Further, Wu and Ostermann do not explicitly disclose that the first computer 
receives a text to speech request signal from a phone through an email computer server 
via a communications network and generates speech commands based on the request. 

In a similar network based text-to-speech system, Walker discloses a computer 
receiving a text to speech request (Fig. 1, item 22a and related text) through an email 
computer server (Fig. 1, item 16 and related text; note that this server receives and 
sends electronic text messages, i.e. it is an email computer server) and generating 
speech from text in response to the request ("speech" output from item 20, Fig. 1 and 
related text). 

It would have been obvious to one with ordinary skill in the art at the time of the 
invention to have received a text to speech request in Wu's first computer and generated 
Wu's speech commands based on the request in order to allow a user to request only 
desired information in real time (see Walker [0003]). 

Claim 2: 

Wu, Ostermann, and Walker disclose the system of claim 1; Walker further 
discloses a second computer (item 22b, Fig. 2) configured to receive the second portion 
of computer readable information from a first computer and to generate a second 
collection of speech generation commands based on the second portion of computer 
readable information (Fig. 2, item 22 and related text), the first computer is further 
configured to receive the second collection of speech generation commands from the 
second computer and to generate a third collection of speech generation commands 
based on the first and second collection of speech generating commands (Fig. 2, item 
24 and related text, [0030]); wherein the first computer generates signals based on the 
third collection of speech generation commands ("Streaming buffer 24 transmits the 
speech segments in the proper order along with the telephony user address to voice 
application", [0031]). 

It would have been obvious to one with ordinary skill in the art at the time of the 
invention to perform, in Wu's system, the text-to-speech process using a plurality of 
engines because the resulting system "efficiently processes text documents of any size" 
(Walker, [0018]) by dividing the text into easily manageable portions. 
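
The division of text into portions and the reassembly of the resulting segments in order can be sketched as follows, purely as an illustration. The functions `split_text`, `synthesize`, and `text_to_speech` are hypothetical stand-ins and are not taken from Walker; the sketch splits on word boundaries and returns placeholder strings instead of audio.

```python
def split_text(text: str, n_parts: int) -> list[str]:
    """Divide text into at most n_parts portions on word boundaries."""
    words = text.split()
    size = max(1, -(-len(words) // n_parts))  # ceiling division
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def synthesize(portion: str) -> str:
    """Stand-in for one TTS engine; returns a placeholder speech segment."""
    return f"<speech:{portion}>"


def text_to_speech(text: str, n_engines: int = 2) -> list[str]:
    """Synthesize each portion separately, then emit segments in order."""
    segments = {}
    for index, portion in enumerate(split_text(text, n_engines)):
        segments[index] = synthesize(portion)
    # Release the segments in their original order, as a buffer would.
    return [segments[i] for i in sorted(segments)]
```

For example, `text_to_speech("a b c d", 2)` yields two segments, one per portion, in the order of the original text.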

Claim 6: 

Wu, Ostermann, and Walker disclose the system of claim 1; Wu further discloses 
wherein the first computer further includes a memory having a voice file stored therein, 
the voice file having a plurality of speech samples from a predetermined person, the first 
collection of speech generation commands being associated with a predetermined set 
of the plurality of speech samples (Fig. 2, element 202 and related text). 

Claim 7: 

Wu discloses a method for generating a collection of speech generation 
commands (Abstract), comprising: 

generating a first collection of speech generation commands (coded speech 
parameters) based on a first portion of computer readable information (text message) in 
a first computer (Fig. 1, step 108 and related text); 

wherein the first computer includes a memory having a voice file stored therein, 
the voice file having a plurality of speech generation commands associated with speech 
samples of a person (Fig. 2, element 202 and related text), wherein the generation of 
the first collection of speech generation commands includes: 

generating phonetic units (phonics) associated with the first portion of computer 
readable information (text message) (Fig. 1, item 106 and related text); 

comparing a phonetic unit to phonetic units stored in the voice file (code table, 
Fig. 2, element 202 and related text) to determine a matched phonetic unit; and 
selecting a speech generation command in the voice file associated with the matched 
phonetic unit (Fig. 1, step 108 and related text). 
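
The comparing and selecting steps recited above can be sketched as a table lookup, purely as an illustration. The table contents, the unit labels, and the function `select_commands` are hypothetical and are not taken from Wu; unmatched units are simply skipped in this sketch.

```python
# Hypothetical code table: phonetic unit -> coded speech parameter.
CODE_TABLE = {"HH": 1, "EH": 2, "L": 3, "OW": 4}


def select_commands(phonetic_units, code_table=CODE_TABLE):
    """Compare each unit against the table and collect the matched commands."""
    commands = []
    for unit in phonetic_units:
        if unit in code_table:  # matched phonetic unit
            commands.append(code_table[unit])
    return commands
```

The returned list plays the role of the first collection of speech generation commands for the given text.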

Wu does not explicitly disclose that the phonetic units associated with the text 
message and the phonetic units stored in the code table are composed of phonemes 
and multi-phonemes. 

However, in the Background of The Invention, Wu discloses that phonemes 
(phones) and multi-phonemes (diphones) are used as phonetic units ([0004]). 

It would have been obvious to one with ordinary skill in the art at the time of the 
invention to represent Wu's phonetic units using phonemes and multi-phonemes 
because they are well known standards in text-to-speech systems. 

Wu further discloses either the phone receiving the first collection of speech 
generation commands and accessing a predetermined set of the speech samples in the 
voice file based on the first collection of speech generation commands to generate 
auditory speech ("the native coded speech parameters, corresponding to each of the 
phonics from the previous step and along with suitable spaces, are subsequently 
processed in a signal processor 208 (such as a DSP for example) to provide a 
decompressed speech signal to an audio circuit 210 of the cellular phone handset", 
[0018]) or the phone receiving signals corresponding to auditory speech and generating 
auditory speech from the received signals ("Alternatively, a network server of the 
communication system can converts this formatted text string to speech and transmit 
this speech to a conventional cellular handset over a voice channel instead of a data 
channel", [0011]). In other words, Wu discloses, either receiving textual information in 
the form of coded speech parameters and performing a text-to-speech process at the 
phone or, performing the text-to-speech (TTS) process at a server and transmitting 
speech to the phone. 

However, Wu does not explicitly disclose determining whether the phone 
includes a voice file (i.e. is able to perform text-to-speech, which requires voice files for 
concatenation based TTS) and conducting the text-to-speech either at the phone, if a 
voice file is present at the phone, or at the server, if a voice file is not present on the 
phone. 

In a similar network based text-to-speech system, Ostermann discloses checking 
if a phone (col. 6, lines 5-11) has speech synthesis software (which requires voice files 
for concatenation based TTS) and performing the TTS at the phone if the phone has 
TTS capabilities or performing the TTS at a server and transmitting synthesized speech 
to the device from the server if the device does not have speech synthesis software 
(col. 11, lines 15-26). 

It would have been obvious to one with ordinary skill in the art at the time of the 
invention to check Wu's phone for TTS capabilities (voice files in concatenation based 
TTS) and to perform TTS on the phone, if the phone has voice files, or to perform the 
TTS on a server and transmit synthesized speech, if the phone does not have TTS 
capabilities, because a phone cannot perform TTS if it does not have TTS capabilities 
(voice files in concatenation based TTS). 

Further, Wu and Ostermann do not explicitly disclose that the first computer 
receives a text to speech request signal from a phone through an email computer server 
via a communications network and generates speech commands based on the request. 

In a similar network based text-to-speech system, Walker discloses a computer 
receiving a text to speech request (Fig. 1, item 22a and related text) through an email 
computer server (Fig. 1, item 16 and related text; note that this server receives and 
sends electronic text messages, i.e. it is an email computer server) and generating 
speech from text in response to the request ("speech" output from item 20, Fig. 1 and 
related text). 

It would have been obvious to one with ordinary skill in the art at the time of the 
invention to have received a text to speech request in Wu's first computer and generated 
Wu's speech commands based on the request in order to allow the user to request only 
desired information in real time (see Walker [0003]). 

Claim 9: 

Wu, Ostermann, and Walker disclose the method of claim 7, Wu further discloses 
wherein the comparing of a phoneme or multi-phoneme to phonemes and multi- 
phonemes stored in the voice file to determine a matched phoneme or multi-phoneme 
includes: 

comparing a multi-phoneme to multi-phonemes stored in the voice file; and, 
comparing a phoneme to phonemes stored in the voice file ("mapping each of the 
phonics from the audio server, by a mapping unit 206, against the code table 202 to find 
the coded speech parameters corresponding to each of the phonics", [0015]). 

Claim 14: 

Wu, Ostermann, and Walker disclose the method of claim 13, Wu further 
discloses wherein the phone includes a memory having a voice file (audio file) stored 
therein, the method further comprising accessing portions of the voice file based on the 
first collections of speech generation commands to generate auditory speech ("the 
native coded speech parameters, corresponding to each of the phonics from the 
previous step and along with suitable spaces, are subsequently processed in a signal 
processor 208 (such as a DSP for example) to provide a decompressed speech signal 
to an audio circuit 210 of the cellular phone handset", [0018]). 

Claim 15: 

Wu, Ostermann, and Walker do not explicitly disclose a computer readable 
medium encoding software for performing the steps of method claim 7. It is old and well- 
known to encode program code for performing a method on a computer readable 
medium and implement instructions corresponding to the program code on a computer's 
processor. Claim 15 is directed to a storage medium encoded with machine-readable 
computer program code for performing the method of claim 7. 

Implementing a method as software on a computer readable medium would be 
an obvious modification to one of ordinary skill in the art of speech synthesis, at the time 
of applicant's invention, so as to facilitate loading the software onto a computer to 
perform the steps listed above. 

Accordingly, claim 15 is rejected with the same rationale as applied above with 
respect to method claim 7. 

(10) Response to Argument 

Regarding claim 1: 

Appellants argue that Walker's voice application 16 is not an email server. The 
Examiner respectfully disagrees. Microsoft Computer Dictionary (Fifth Edition, 2002) 
defines a server as: 

On the internet or other network, a computer or program that responds to 
commands from a client. 

Walker discloses voice application 16 as an application, in a network, that 
receives a command (request) from a client (Telephony user 12) and which responds to 
the command by accessing textual data and forwarding synthesized speech of the 
textual data ([0023]). Therefore, voice application 16 is a server. Further, Walker 
discloses that the textual data may be email ([0022]). Therefore, voice application 16 is 
a server which accesses email and forwards the email in speech form, i.e. it is an email 
server. 

Regarding all other claims: 

Appellants' arguments are similar to the one above regarding claim 1. 

(11) Related Proceeding(s) Appendix 

No decision rendered by a court or the Board is identified by the examiner in the 
Related Appeals and Interferences section of this examiner's answer. 

For the above reasons, it is believed that the rejections should be sustained. 

Respectfully submitted, 

Samuel G. Neway /S. N./ 

/David R Hudspeth/ 

Supervisory Patent Examiner, Art Unit 2626 

/James S. Wozniak/ 

Primary Examiner, Art Unit 2626 